fix: add deepseek-v4 models, fix calculate_cost, and improve error ha… by Comui520 · Pull Request #2736 · confident-ai/deepeval

Comui520 · 2026-06-09T08:19:55Z

Summary

Register deepseek-v4-flash and deepseek-v4-pro in DEEPSEEK_MODELS_DATA
Fix calculate_cost() across all 8 providers to never return None
Fix ContextGenerator and Synthesizer to surface errors instead of silently returning empty results

Details

Problem

When pricing data is unknown (model not in registry), calculate_cost() returned None
across ALL providers (DeepSeek, Anthropic, OpenAI, Azure, Gemini, Kimi, Grok, Bedrock),
causing TypeError in ContextGenerator.evaluate_chunk() (total_cost += None).
This error was caught by a broad except Exception and logged but never surfaced,
resulting in empty goldens with no error to the user.
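
The failure mode described above can be reproduced in isolation. The following is a minimal sketch under stated assumptions, not deepeval's actual code: `PRICES`, `calculate_cost`, and `process_documents` are hypothetical stand-ins that only mimic an optional cost being added to a running total inside a broad `except Exception`.

```python
import logging
from typing import Dict, List, Optional

logger = logging.getLogger("sketch")

# Hypothetical price table (USD per token); not deepeval's model registry.
PRICES: Dict[str, float] = {"known-model": 0.000001}


def calculate_cost(model: str, tokens: int) -> Optional[float]:
    """Return the cost, or None when the model has no pricing entry."""
    price = PRICES.get(model)
    return None if price is None else price * tokens


def process_documents(model: str, docs: List[str]) -> List[str]:
    """Mimic a pipeline that catches and logs per-document errors."""
    total_cost = 0.0
    contexts: List[str] = []
    for doc in docs:
        try:
            # Raises TypeError (float += None) when pricing is unknown.
            total_cost += calculate_cost(model, tokens=len(doc))
            contexts.append(doc.upper())
        except Exception as exc:  # broad catch hides the real cause
            logger.warning("document failed: %r", exc)
    return contexts
```

With a priced model, every document yields a context. With an unpriced model, each iteration fails before `contexts.append` runs, so the caller receives an empty list and only a log line hints at why.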

Changes

All 8 provider models — calculate_cost() now returns EvaluationCost(0.0, ...)
when prices are unknown, instead of None. Fixes the Optional[float] return type
to be consistently float.

constants.py — Added model definitions for deepseek-v4-flash and deepseek-v4-pro.

context_generator.py — generate_contexts() and a_generate_contexts() now raise
DeepEvalError when ALL documents fail, instead of silently returning empty contexts.

synthesizer.py — generate_goldens_from_docs() and a_generate_goldens_from_docs()
now raise DeepEvalError when contexts are empty.
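
The "raise when ALL documents fail" behavior could look roughly like the sketch below. This is an illustration only: `PipelineError` is a hypothetical stand-in for `DeepEvalError`, and this `generate_contexts` signature is an assumption made for the example, not the one in `context_generator.py`.

```python
from typing import Callable, List


class PipelineError(Exception):
    """Hypothetical stand-in for deepeval's DeepEvalError."""


def generate_contexts(docs: List[str], process: Callable[[str], str]) -> List[str]:
    """Process each document; tolerate partial failure, surface total failure."""
    contexts: List[str] = []
    failures: List[str] = []
    for doc in docs:
        try:
            contexts.append(process(doc))
        except Exception as exc:
            # Keep the cause so it can be reported if nothing succeeds.
            failures.append(f"{doc}: {exc!r}")
    if docs and not contexts:
        raise PipelineError("all documents failed: " + "; ".join(failures))
    return contexts
```

Collecting the per-document causes and including them in the raised message means the original error (for example, the `TypeError` above) reaches the user instead of only appearing in a log.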

Why this matters now

Per DeepSeek API docs, deepseek-chat and
deepseek-reasoner will be deprecated on 2026-07-24 in favor of deepseek-v4-flash
and deepseek-v4-pro.

vercel · 2026-06-09T08:19:59Z

@Comui520 is attempting to deploy a commit to the Confident AI Team on Vercel.

A member of the Team first needs to authorize it.

penguine-ip · 2026-06-14T16:10:55Z

hey @Comui520 thanks for this. However, I believe the fix should be at the evaluate_chunk boundary instead. The None is important because it avoids misleading the user into thinking the evaluation cost is actually 0. Could you make that fix, and leave the other models alone? Thanks!

- Register deepseek-v4-flash and deepseek-v4-pro in DEEPSEEK_MODELS_DATA
- Guard evaluate_chunk and a_evaluate_chunk against None cost from calculate_cost() when model pricing is unknown
- Guard _generate_schema, _a_generate_schema, _generate, _a_generate against None cost in the same way
- Raise DeepEvalError when all document pipelines fail instead of silently returning empty contexts
- Raise DeepEvalError when generated contexts list is empty

deepseek-chat and deepseek-reasoner will be deprecated on 2026-07-24 in favor of v4-flash and v4-pro.

Comui520 · 2026-06-15T01:17:11Z

Done — moved the fix to the boundaries, left calculate_cost and all other models untouched. Here's what changed:

Files changed (3)

| File | Change |
| --- | --- |
| `deepeval/models/llms/constants.py` (+16) | Register `deepseek-v4-flash` and `deepseek-v4-pro` in `DEEPSEEK_MODELS_DATA` |
| `deepeval/synthesizer/chunking/context_generator.py` (+23/-2) | Guard `evaluate_chunk` and `a_evaluate_chunk` against `None` cost; raise `DeepEvalError` when all document pipelines fail instead of silently returning empty contexts |
| `deepeval/synthesizer/synthesizer.py` (+29/-3) | Guard `_generate_schema`, `_a_generate_schema`, `_generate`, `_a_generate` against `None` cost; raise `DeepEvalError` when generated contexts list is empty |

What the fix does

At every call site where the cost returned by calculate_cost() is added to a running total, a None guard is now in place:

# Before (crashes when pricing is unknown)
self.total_cost += cost

# After
if cost is not None:
    self.total_cost += cost

This prevents the TypeError that was silently swallowed by the broad except Exception in generate_contexts(), which previously caused empty goldens with no error surfaced.

The context generator and synthesizer now also raise explicit DeepEvalError instead of silently returning empty results when all documents fail or no contexts are produced.

Why this approach

calculate_cost() returning None is the correct contract — "unknown price" ≠ "zero price"
Guards at the boundary prevent arithmetic errors without misleading users about costs
No changes to other model providers or tests
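
To illustrate why "unknown price" and "zero price" are worth keeping apart, here is a small sketch of an accumulator that skips `None` costs but counts them. This goes beyond what the PR does (the PR only skips the addition); `CostTracker` and its fields are hypothetical names used for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CostTracker:
    """Accumulate known costs and count calls whose pricing is unknown."""

    total: float = 0.0
    unknown_calls: int = 0

    def add(self, cost: Optional[float]) -> None:
        if cost is None:
            # Unknown pricing: do not pretend the call was free.
            self.unknown_calls += 1
        else:
            self.total += cost

    def report(self) -> str:
        if self.unknown_calls:
            return (
                f"${self.total:.4f} known + "
                f"{self.unknown_calls} call(s) with unknown pricing"
            )
        return f"${self.total:.4f}"
```

Had `calculate_cost()` returned `0.0` for unpriced models, the report could not distinguish a free run from one whose cost simply was not measured.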


Comui520 · 2026-06-15T05:45:32Z

@penguine-ip, I appreciate your feedback. The None handling is now placed at the evaluate_chunk boundary, with other provider code untouched. Error visibility and new DeepSeek model support are also included.

Comui520 force-pushed the fix/deepseek-v4-support branch 3 times, most recently from 38b6cae to 4d4664e (June 9, 2026 11:31)

kimnamu mentioned this pull request Jun 13, 2026

fix(bedrock): honor user-supplied per-token costs in AmazonBedrockModel #2753

Merged

penguine-ip added the awaiting code fix label Jun 14, 2026

Comui520 force-pushed the fix/deepseek-v4-support branch from 4d4664e to a7708cb (June 15, 2026 01:07)

Comui520 wants to merge 1 commit into confident-ai:main from Comui520:fix/deepseek-v4-support
